MODULE 7:   Graph Theory
Matchings in Graphs

In this chapter, we continue our discussion on graphs and turn our attention to finding matchings in graphs. Algorithms to find matchings are used a lot in real-world applications, and we discuss some of these applications in lecture. This chapter will expand your toolkit for reasoning about graphs and help you build more intuition about them.

1
Maximum Matchings
Definition (Matching – maximum, maximal, perfect)

A matching in a graph G=(V,E)G=(V,E) is a subset of the edges that do not share an endpoint. A maximum matching in GG is a matching with the maximum number of edges among all possible matchings. A maximal matching is a matching with the property that if we add any other edge to the matching, it is no longer a matching. A perfect matching is a matching that covers all the vertices of the graph.

Example (Examples of matchings)

Consider the following graph.

image

Note that the empty set and a set with only one edge is always a matching. The set M={{v1,v5},{v4,v7}}M = \{\{v_1, v_5\}, \{v_4,v_7\}\} is a maximal matching with 2 edges, since we if we tried to add another edge to this set, it would no longer be a matching. On the other hand, this maximal matching is not a maximum matching because there is another matching with 3 edges: M={{v1,v6},{v3,v5},{v4,v7}}M' = \{ \{v_1,v_6\}, \{v_3,v_5\}, \{v_4,v_7\} \}. This graph does not have a perfect matching. One easy way to see this is that it has an odd number of vertices, and any graph with an odd number of vertices cannot have a perfect matching.

Note (Size of a matching)

The size of a matching MM refers to the number of edges in the matching, and is denoted by M|M|. Note that this coincides with the size of the set that MM represents.

Definition (Maximum matching problem)

In the maximum matching problem the input is an undirected graph G=(V,E)G=(V,E) and the output is a maximum matching in GG.

Definition (Augmenting path)

Let G=(V,E)G = (V,E) be a graph and let MEM \subseteq E be a matching in GG. An augmenting path in GG with respect to MM is a path such that

  1. the path is an alternating path, which means that the edges in the path alternate between being in MM and not in MM,

  2. the first and last vertices in the path are not a part of the matching MM.

Example (Augmenting path example 1)

Consider a single edge {u,v}\{u,v\} in a graph GG such that uu and vv are not matched by a matching MM (this means that vertices uu and vv do not belong to any of the edges in MM). Then this edge forms an augmenting path.

Example (Augmenting path example 2)

Consider the following graph.

image

Let MM be the matching {{v1,v5},{v3,v6},{v4,v8}}\{\{v_1,v_5\}, \{v_3,v_6\}, \{v_4,v_8\}\}. Then the path (v2,v5,v1,v7)(v_2,v_5,v_1,v_7) is an augmenting path with respect to MM.

Note (Edge cases for augmenting paths)

An augmenting path does not need to contain all the edges in MM. It is also possible that it does not contain any of the edges of MM. A single edge {u,v}\{u, v\}, where uu is not matched and vv is not matched, is an augmenting path.

Theorem (Characterization for maximum matchings)

Let G=(V,E)G=(V,E) be a graph. A matching MEM \subseteq E is maximum if and only if there is no augmenting path in GG with respect to MM.

Proof

The statement we want to prove is equivalent to the following. Given a graph G=(V,E)G = (V,E), a matching MEM \subseteq E is not maximum if and only if there is an augmenting path in GG with respect to MM. There are two directions to prove.

First direction: Suppose there is an augmenting path in GG with respect to MM. Then we want to show that MM is not maximum. Let the augmenting path be v1,v2,,vkv_1,v_2,\ldots,v_k:

image

The highlighted edges represent edges in MM. By the definition of an augmenting path, we know that v1v_1 and vkv_k are not matched by MM. Since v1v_1 and vkv_k are not matched and the path is alternating, the number of edges on this path that are in the matching is one less than the number of edges not in the matching. To see that MM is not a maximum matching, observe that we can obtain a bigger matching by flipping the matched and unmatched edges on the augmenting path. In other words, if an edge on the path is in the matching, we remove it from the matching, and if an edge on the path is not in the matching, we put it in the matching. This gives us a matching larger than MM, so MM is not maximum.

Second direction: We now prove the other direction. In particular, we want to show that if MM is not a maximum matching, then we can find an augmenting path in GG with respect to MM. Since MM is not a maximum, there is a matching with larger size. Let MM^* be a matching such that M>M|M^*| > |M|. We define the set SS to be the set of edges contained in MM^* or MM, but not both. That is, S=(MM)\(MM)S = (M^* \cup M) \backslash (M^* \cap M) (this is the symmetric difference of MM^* and MM). If we color the edges in MM with blue, and the edges in MM^* with red, then SS consists of edges that are colored either blue or red, but not both (i.e. no purple edges). Below is an example:

image

(Horizontal edges correspond to the red edges. The rest is blue.) Our goal is to find an augmenting path with respect to MM in SS (i.e., with respect to the blue edges), and once we do this, the proof will be complete.

We now proceed to find an augmenting path with respect to MM in SS. To do so, we make a couple of important observations about SS. First, notice that each vertex that is a part of SS has degree 11 or 22 because it can be incident to at most one edge in MM and at most one edge in MM^*. (If the degree was more than 22, either two edges from MM or two edges from MM^* would be intersecting, but in a matching, edges cannot intersect.) We now make two claims:

  1. Because every vertex has degree 11 or 22, SS consists of disjoint paths and cycles (i.e. each connected component is either a path or a cycle).

    image

  2. The edges in these paths and cycles alternate between blue and red.

The proof of the first claim is omitted and is left as an exercise for the reader. The second claim is true because if the edges were not alternating, i.e., if there were two red or two blue edges in a row, then this would imply the red edges or the blue edges do not form a matching (remember that in a matching no two edges can share an endpoint).

Since MM^* is a bigger matching than MM, we know that SS has more red edges than blue edges. Observe that the cycles in SS must have even length, because otherwise the edges cannot alternate between blue and red. Therefore the cycles have an equal number of red and blue edges. This implies that there must be a path in SS with more red edges than blue edges. In particular, this path starts and ends with a red edge. This path is an augmenting path with respect to MM (i.e., the blue edges), since it is clearly alternating between edges in MM and edges not in MM, and the endpoints are unmatched with respect to MM. So using the assumption that MM is not maximum, we were able to find an augmenting path with respect to MM. This completes the proof.

Exercise (Graphs with max degree at most 2)

Let G=(V,E)G = (V,E) be a graph such that all vertices have degree at most 22. Then prove that every connected component of GG is either a path or a cycle (where we count an isolated vertex as a path of length 0).

Solution

Consider a graph GG such that all vertices have degree at most 2. We want to show that it consists of disjoint paths and cycles. We prove this by induction on the number of vertices.

Pick an arbitrary vertex vv in the graph. Removing vv results in a graph GvG-v such that every vertex has degree at most 2. Since GvG-v has one less vertex, by induction hypothesis, GvG-v consists of disjoint paths and cycles. There are 3 cases to consider: deg(v)=0,1,\deg(v) = 0, 1, or 22. It is not hard to see that in each case, adding vv back to the graph preserves the property that the graph is a collection of disjoint paths and cycles. (Verify this part for yourself.)

Exercise (A tree can have at most one perfect matching)

Show that a tree can have at most one perfect matching.

Hint

Assume, for the sake of contradiction, there are two perfect matchings MM and MM'. Then as in the proof of Theorem (Characterization for maximum matchings), look at the symmetric difference between these two matchings.

Solution

The proof is by contradiction, so suppose a tree has two different perfect matchings MM and MM'. As in the proof of Theorem (Characterization for maximum matchings), we’ll consider the symmetric difference between these two matchings. So let SS be the symmetric difference between MM and MM', i.e., S=(MM)\(MM)S = (M \cup M') \backslash (M \cap M'). Since M=M|M| = |M'| and MMM \neq M', we must have S>1|S| > 1 (there must be an edge in MM that is not in MM', and vice versa). The set SS corresponds to a graph in which each vertex has degree at most 22. So this graph consists of disjoint paths and cycles (i.e. each connected component is either a path or a cycle). But a connected component cannot be a cycle since trees are acyclic. It also cannot be a path. This is because the existence of a degree 1 vertex in SS implies that this vertex is not covered/matched by either MM or MM' (verify this yourself), and this would contradict the fact that MM and MM' are perfect matchings covering all vertices. So SS must be the empty set, which contradicts our assumption that S>1|S| > 1.

2
Bipartite Graphs
Definition (Bipartite graph)

A graph G=(V,E)G = (V,E) is called bipartite if there is a partition of VV into sets XX and YY such that all the edges in EE have one endpoint in XX and the other in YY. Sometimes the bipartition is given explicitly and the graph is denoted by G=(X,Y,E)G = (X, Y, E).

Example (Bipartite graph example)

Below is an example of a bipartite graph.

image

Definition (kk-colorable graphs)

Let G=(V,E)G = (V,E) be a graph. Let kN+k \in \mathbb{N}^+. A kk-coloring of VV is just a map χ:VC\chi : V \to C where CC is a set of cardinality kk. (Usually the elements of CC are called colors. If k=3k = 3 then C={red,green,blue}C = \{\text{red},\text{green},\text{blue}\} is a popular choice. If kk is large, we often just call the “colors” 1,2,,k1,2, \dots, k.) A kk-coloring is said to be legal for GG if every edge in EE is bichromatic, meaning that its two endpoints have different colors. (I.e., for all {u,v}E\{u,v\} \in E it is required that χ(u)χ(v)\chi(u)\neq\chi(v).) Finally, we say that GG is kk-colorable if it has a legal kk-coloring.

Example (A 33-colorable graph)

The graph below is 33-colorable. We can color the vertex at the center green, and color the outer vertices with blue and red by alternating those two colors.

image

Note (2-colorability is equivalent to bipartiteness)

A graph G=(V,E)G=(V,E) is bipartite if and only if it is 22-colorable. The 22-coloring corresponds to partitioning the vertex set VV into XX and YY such that all the edges have one endpoint in XX and the other in YY.

Theorem (Characterization of bipartite graphs)

A graph is bipartite if and only if it contains no odd-length cycles.

Proof

There are two directions to prove.

(\Longrightarrow): For this direction, we want to show that if a graph is bipartite, then it contains no odd-length cycles. We prove the contrapositive. Observe that it is impossible to 22-color an odd-length cycle. So if a graph contains an odd-length cycle, the graph cannot be 22-colored, and therefore cannot be bipartite.

(\Longleftarrow): For this direction, we want to show that if a graph does not contain an odd-length cycle, then it is bipartite. So suppose the graph contains no cycles of odd length. Without loss of generality, assume the graph is connected (if it is not, we can apply the argument to each connected component separately). For u,vVu,v \in V, let dist(u,v)\text{dist}(u,v) denote the length of the shortest path from uu to vv (or from vv to uu). Pick a starting vertex/root ss and consider the “BFS tree” rooted at ss. In this tree, level 0 corresponds to ss, and level ii corresponds to all vertices vv with dist(s,v)=i\text{dist}(s,v) = i. Color odd-indexed levels blue, and color even-indexed levels red.

The proof is done once we show that this is a valid 22-coloring of the graph. To show this, we’ll argue that no edge has its endpoints colored the same color. There are two types of edges we need to worry about that could potentially have its endpoints colored the same color. We consider each type below.

First, there could potentially be an edge between two vertices uu and vv at the same level. Let’s assume such an edge exists. Let ww be the lowest common ancestor of uu and vv. Note that dist(u,w)=dist(v,w)\text{dist}(u,w) = \text{dist}(v,w), so the path from ww to uu, plus the path from ww to vv, plus the edge {u,v}\{u,v\}, form an odd-length cycle. This is a contradiction.

Second, we need to consider the possibility that there is an edge between a vertex uu at level ii and a vertex vv at level i+2ki + 2k for some k>0k > 0. However, the existence of such an edge implies that dist(s,v)i+1\text{dist}(s,v) \leq i+1, which contradicts the fact that vv is at level i+2ki+2k. So this type of edge cannot exist either. This completes the proof.

Theorem (Finding a maximum matching in bipartite graphs)

There is a polynomial time algorithm to solve the maximum matching problem in bipartite graphs.

Proof

Let G=(X,Y,E)G=(X,Y,E) be the input graph. The high level steps of the algorithm is as follows.

  • Let M={{x,y}}M = \{ \{x,y\} \} where {x,y}E\{x,y\} \in E is an arbitrary edge.

  • Repeat until there is no augmenting path with respect to MM:

    • Find an augmenting path with respect to MM.

    • Update MM according to the augmenting path (swapping matched and unmatched edges along the path).

Every time we find an augmenting path, we increase the size of our matching by one. When there are no more augmenting paths, we stop and correctly output a maximum matching (the correctness follows from Theorem (Characterization for maximum matchings)). The only unclear step of the algorithm is finding an augmenting path with respect to MM. And we explain how to do this step below. But before we do that, note that if this step can be done in polynomial time, then the overall running time of the algorithm is polynomial time since the loop repeats O(n)O(n) times and the work done in each iteration is polynomial time.

We now show how to find an augmenting path (given G=(X,Y,E)G = (X,Y,E) and MEM \subseteq E). We first turn GG into a directed graph G\vec{G} as follows:

  • Direct edges in E\ME \backslash M from XX to YY.

  • Direct edges in MM from YY to XX.

Note that an alternating path with respect to MM in GG corresponds to a directed path in G\vec{G}. (We leave it to the reader to verify this.) This means that there is an augmenting path with respect to MM in GG if and only if there is a directed path in G\vec{G} from an unmatched vertex xx in XX to an unmatched vertex yy in YY. So to find an augmenting path, we’ll search for a directed path in G\vec{G} from an unmatched xx to an unmatched yy, as follows:

  • For each unmatched xXx \in X:

    • Run DFS(G\vec{G}, xx).

    • If you hit an unmatched yYy \in Y, output the path from xx to yy.

  • Output “no augmenting path found.”

The running time is polynomial time since the loop repeats at most O(n)O(n) times, and the work done in each iteration is polynomial time.

Note (Finding a maximum matching in non-bipartite graphs)

The high-level algorithm above presented in the proof of Theorem (Finding a maximum matching in bipartite graphs) is in fact applicable to general (not necessarily bipartite) graphs. However, the step of finding an augmenting path with respect to a matching turns out to be much more involved, and therefore we do not cover it in this chapter. See https://en.wikipedia.org/wiki/Blossom_algorithm if you would like to learn more.

3
Bonus: Hall’s Theorem
Theorem (Hall’s Theorem)

Let G=(X,Y,E)G = (X,Y,E) be a bipartite graph. For a subset SS of the vertices, let N(S)=vSN(v)N(S) = \bigcup_{v \in S} N(v). Then GG has a matching covering all the vertices in XX if and only if for all SXS \subseteq X, we have SN(S)|S| \leq |N(S)|.

Proof

There are two directions to prove.

(\Longrightarrow): For this direction, we need to show that if GG has a matching covering all the vertices in XX, then every SXS \subseteq X satisfies SN(S)|S| \leq |N(S)|. We consider the contrapositive. So suppose there is some SXS \subseteq X such that S>N(S)|S| > |N(S)|. The vertices in SS can only be matched to vertices in N(S)N(S), and since S>N(S)|S| > |N(S)|, there cannot be a matching that covers every element in SS. And this implies there cannot be a matching covering every element of XX.

(\Longleftarrow): For this direction, we need to show that if every SXS \subseteq X satisfies SN(S)|S| \leq |N(S)|, then there is a matching that covers all the vertices in XX. We will prove the contrapositive. So assume there is no matching that covers all the vertices in XX. Our goal is to find some SXS \subseteq X such that S>N(S)|S| > |N(S)|.

In order to identify such a set SS, we need to make a couple of definitions. Let MM be a maximum matching and let xXx \in X be an element that it does not cover. We turn GG into a directed graph as follows: direct all edges not in MM from XX to YY, and direct all edges in MM from YY to XX. We define LXL \subseteq X to be the set of vertices in XX that you can reach by a directed path starting at xx (LL does not include xx). And we define RYR \subseteq Y to be the set of all vertices in YY that you can reach by a directed path starting at xx. Here is an illustration:

image

We will show that for S=L{x}S = L \cup \{x\}, we have S>N(S)|S| > |N(S)|. We need two claims to argue this.

Claim 1: L=R|L| = |R|.
Proof: Each L\ell \in L is matched to some rRr \in R because the only way we can reach an L\ell \in L is through an edge in the matching. Conversely, each rRr \in R must be matched to some L\ell \in L since if this was not true, i.e., if there was an unmatched rRr \in R, that would imply that the path from xx to rr is an augmenting path, and this would contradict the fact that MM is a maximum matching (Theorem (Characterization for maximum matchings)). Since every element of LL is matched by MM to an element of RR and vice versa, there is a one-to-one correspondence between LL and RR, i.e., L=R|L| = |R|.

Claim 2: In the original undirected graph, N(L{x})RN(L \cup \{x\}) \subseteq R.
(In fact, N(L{x})=RN(L \cup \{x\}) = R but we only need one side of the inclusion.)
Proof: For any L{x}\ell \in L \cup \{x\}, we want to argue that N()RN(\ell) \subseteq R. First consider the case that =x\ell = x. Then all the neighbors of xx must be in RR since all the edges incident to xx are directed from left to right. So N(x)RN(x) \subseteq R. Now consider any L\ell \in L. We want to argue that all the neighbors of \ell must be in RR. To argue about the neighbors of \ell, we look at all the edges incident to \ell. In this set of edges incident to \ell, exactly one edge ee is in the matching MM. Since eMe \in M is directed from YY to XX, it must be incident to some rRr \in R because the only way to reach \ell is through some rRr \in R via ee. If we now look at all the other edges incident to \ell, note that they must be directed from XX to YY, and the vertices KYK \subseteq Y that they are incident to must be in RR. This is because by definition of LL, L\ell \in L is reachable from xx, which means the vertices in KK would also be reachable from xx, and therefore would be in RR (by the definition of RR). This shows that every neighbor of \ell is in RR, and completes the proof of Claim 2.

Combining Claim 1 and Claim 2 above, we have L{x}>RN(L{x}),|L \cup \{x\}| > |R| \geq |N(L \cup \{x\})|, i.e., for S=L{x}S = L \cup \{x\}, S>N(S)|S| > |N(S)|, as desired.

Corollary (Characterization of bipartite graphs with perfect matchings)

Let G=(X,Y,E)G = (X,Y,E) be a bipartite graph. Then GG has a perfect matching if and only if X=Y|X| = |Y| and for any SXS \subseteq X, we have SN(S)|S| \leq |N(S)|.

Note (Hall’s Theorem when the two parts have equal size)

Sometimes people call the above corollary Hall’s Theorem.

Exercise (Practice with perfect matchings)
  1. Let GG be a bipartite graph on 2n2n vertices such that every vertex has degree at least nn. Prove that GG must contain a perfect matching.

  2. Let G=(X,Y,E)G = (X,Y,E) be a bipartite graph with X=Y|X| = |Y|. Show that if GG is connected and every vertex has degree at most 22, then GG must contain a perfect matching.

Solution

Part 1: If every vertex has degree nn, it must be the case that the graph is G=(X,Y,E)G=(X,Y,E) where X=Y=n|X| = |Y| = n. It must also be the case that the graph is a complete bipartite graph (i.e., every possible edge is present). So clearly Hall’s theorem applies and the graph has a perfect matching.

Part 2: A graph with max degree at most 2 consists of disjoint paths and cycles (Exercise (Graphs with max degree at most 2)). Since the graph is connected, it must consist of a single path or a single cycle containing all the vertices. In either case, it is not hard to see that the graph must contain a perfect matching: If the graph is a path, then we can take every other edge (starting with the first edge) along the path to form a perfect matching. If the graph consists of a cycle, it has two different perfect matchings.

4
Check Your Understanding
Problem
  1. True or false: A maximum matching is a maximal matching.

  2. True or false: A perfect matching is a maximum matching.

  3. True or false: There exist graphs with more than one perfect matching.

  4. How many perfect matchings are there in a complete bipartite graph (all possible edges are present) where both sides of the bipartition contain exactly nn vertices?

  5. Suppose a graph with nn vertices has a perfect matching. What is the size of the perfect matching?

  6. What is the definition of an augmenting path?

  7. True or false: A single edge {u,v}\{u, v\}, where uu is not matched and vv is not matched, is an augmenting path.

  8. True or false: Given a matching MM, there can be at most one augmenting path with respect to MM.

  9. True or false: A matching MM in a non-bipartite graph GG is maximum if and only if there is no augmenting path with respect to MM.

  10. Describe the high-level steps of a polynomial-time algorithm for finding a maximum matching in a graph.

  11. Describe a polynomial-time algorithm that given a bipartite graph and a matching in the graph, determines if an augmenting path exists with respect to the matching.

  12. True or false: The graph below is bipartite.

    image

  13. True or false: If a graph is kk-colorable, then it is (k+1)(k+1)-colorable.

  14. True or false: A graph which is not bipartite must contain an odd-length cycle.

  15. Explain how one can 22-color a graph that does not contain an odd-length cycle.

  16. Describe a polynomial-time algorithm to determine if an input undirected graph is bipartite or not.

  17. How do you prove that a tree has at most one perfect matching?

  18. True or false: Any graph with more than one perfect matching must contain a cycle.

  19. In this chapter we saw the definition of a bipartite graph. We can think of a bipartite graph as a special case of a kk-partite graph where k=2k = 2. How would you define the general notion of a kk-partite graph, and how does it relate to the colorability of the graph?

5
High-Order Bits
Important Note

Here are the important things to keep in mind from this chapter.

  1. As always, all the definitions in this chapter are important (matchings, augmenting path, bipartite graph, kk-coloring of a graph).

  2. Theorem (Characterization of bipartite graphs) gives an important characterization for bipartite graphs. Make sure to understand its proof.

  3. Theorem (Characterization for maximum matchings) gives an important characterization for maximum matchings. The proof is one of the harder proofs we have done in the course so far. A good understanding of the proof is expected and important since it will help you raise your level of mathematical maturity.

  4. Make sure to understand how we can use Theorem (Characterization for maximum matchings) to come up with an algorithm for finding a maximum matching in a graph.

Note that a maximal matching is not necessarily a maximum matching, but a maximum matching is always a maximal matching.
Recall that a partition of \(V\) into \(X\) and \(Y\) means that \(X\) and \(Y\) are disjoint and \(X \cup Y = V\).
Theorem (Characterization for maximum matchings)

Let \(G=(V,E)\) be a graph. A matching \(M \subseteq E\) is maximum if and only if there is no augmenting path in \(G\) with respect to \(M\).

See in chapter
Theorem (Characterization for maximum matchings)

Let \(G=(V,E)\) be a graph. A matching \(M \subseteq E\) is maximum if and only if there is no augmenting path in \(G\) with respect to \(M\).

See in chapter
Theorem (Characterization for maximum matchings)

Let \(G=(V,E)\) be a graph. A matching \(M \subseteq E\) is maximum if and only if there is no augmenting path in \(G\) with respect to \(M\).

See in chapter
Theorem (Finding a maximum matching in bipartite graphs)

There is a polynomial time algorithm to solve the maximum matching problem in bipartite graphs.

See in chapter
Theorem (Characterization for maximum matchings)

Let \(G=(V,E)\) be a graph. A matching \(M \subseteq E\) is maximum if and only if there is no augmenting path in \(G\) with respect to \(M\).

See in chapter
Exercise (Graphs with max degree at most 2)

Let \(G = (V,E)\) be a graph such that all vertices have degree at most \(2\). Then prove that every connected component of \(G\) is either a path or a cycle (where we count an isolated vertex as a path of length 0).

See in chapter
Theorem (Characterization of bipartite graphs)

A graph is bipartite if and only if it contains no odd-length cycles.

See in chapter
Theorem (Characterization for maximum matchings)

Let \(G=(V,E)\) be a graph. A matching \(M \subseteq E\) is maximum if and only if there is no augmenting path in \(G\) with respect to \(M\).

See in chapter
Theorem (Characterization for maximum matchings)

Let \(G=(V,E)\) be a graph. A matching \(M \subseteq E\) is maximum if and only if there is no augmenting path in \(G\) with respect to \(M\).

See in chapter